Fast Chirplet Transform to Enhance CNN Machine Listening - Validation on Animal Calls and Speech
Abstract
The scattering framework offers an optimal hierarchical convolutional decomposition according to its kernels. A Convolutional Neural Network (CNN) can be seen as an optimal kernel decomposition; nevertheless, it requires a large amount of training data to learn its kernels. We propose a trade-off between these two approaches: a Chirplet kernel as an efficient constant-Q bioacoustic representation to pretrain a CNN. First, we motivate the bioinspired Chirplet auditory representation. Second, we give the first algorithm (and code) for a Fast Chirplet Transform (FCT). Third, we demonstrate the computational efficiency of FCT on large environmental databases: months of Orca recordings, and 1000 bird species from the LifeCLEF challenge. Fourth, we validate FCT on the vowels subset of the TIMIT speech dataset. The results show that FCT accelerates CNN training when it pretrains the low-level layers: it reduces training duration by 28% for bird classification and by 26% for vowel classification. Scores are also improved by FCT pretraining, with relative gains of 7.8% in Mean Average Precision on birds and 2.3% in vowel accuracy compared with a CNN trained on raw audio. We conclude with perspectives on tonotopic FCT deep machine listening, and on inter-species bioacoustic transfer learning to generalise the representation of animal communication systems.
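To make the notion of a Chirplet kernel concrete, the following is a minimal sketch of a Gaussian-windowed linear chirp and of its correlation with a signal. It is not the authors' Fast Chirplet Transform and does not implement the constant-Q spacing mentioned in the abstract. The function names (`chirplet`, `chirplet_response`) and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def chirplet(duration, sample_rate, f0, chirp_rate, sigma):
    """Complex Gaussian chirplet centred at t = 0 with unit L2 norm.

    Illustrative parameters (assumptions, not taken from the paper):
    f0 is the centre frequency in Hz, chirp_rate the frequency slope
    in Hz/s, and sigma the Gaussian envelope width in seconds.
    """
    n = int(round(duration * sample_rate))
    # Time axis symmetric around zero, in seconds.
    t = (np.arange(n) - (n - 1) / 2) / sample_rate
    envelope = np.exp(-0.5 * (t / sigma) ** 2)
    # Quadratic phase gives a linearly varying instantaneous frequency.
    phase = 2 * np.pi * (f0 * t + 0.5 * chirp_rate * t ** 2)
    atom = envelope * np.exp(1j * phase)
    return atom / np.linalg.norm(atom)


def chirplet_response(signal, atom):
    """Magnitude of the cross-correlation between a signal and a chirplet."""
    # Convolving with the time-reversed conjugate is a correlation.
    return np.abs(np.convolve(signal, np.conj(atom[::-1]), mode="same"))


# Hypothetical usage: a rising chirp buried in silence responds more
# strongly to a rising chirplet than to a falling one.
sample_rate = 8000
up = chirplet(0.05, sample_rate, f0=1000.0, chirp_rate=20000.0, sigma=0.01)
down = chirplet(0.05, sample_rate, f0=1000.0, chirp_rate=-20000.0, sigma=0.01)
signal = np.zeros(2000)
signal[800:800 + up.size] = up.real
peak_up = chirplet_response(signal, up).max()
peak_down = chirplet_response(signal, down).max()
```

A bank of such atoms with different centre frequencies and chirp rates could serve as fixed first-layer filters, which is the general idea behind pretraining the low-level layers of a CNN; how the paper actually builds and applies its filter bank is not specified in this abstract.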
Similar resources
On the Evaluation of the Conversational Speech Quality in Telecommunications
We propose an objective method to assess speech quality in the conversational context by taking into account the talking and listening speech qualities and the impact of delay. This approach is applied to the results of four subjective tests on the effects of echo, delay, packet loss, and noise. The dataset is divided into training and validation sets. For the training set, a multiple linear re...
Examining the Association between T-unit and Pausing Length on the EFL Perception of Listening Comprehension
Listening, which takes over half of the learners’ time and effort (Nunan, 1998), forms a basis for acquiring much of a language. There are factors affecting listening comprehension and its perception, such as the speech rate, phonological properties of the text, the quality of the recording, the learners’ anxiety, and listening comprehension strategies (Goh, 2000; Hamouda, 2013). At the Iran Language...
An Investigation into the Effect of Authentic Materials on Iranian EFL Learners’ English Listening Comprehension
Listening comprehension means the process of understanding speech in a second or foreign language. It is the perception of information and stimuli received through the ears. Iranian EFL learners have serious difficulties in English listening comprehension because Iranian universities pay more attention to grammar and reading. The researchers used a set of authentic materials to improve Iranian ...
Free Field Word recognition test in the presence of noise in normal hearing adults.
INTRODUCTION In ideal listening situations, subjects with normal hearing can easily understand speech, as can many subjects who have a hearing loss. OBJECTIVE To present the validation of the Word Recognition Test in a Free Field in the Presence of Noise in normal-hearing adults. METHODS Sample consisted of 100 healthy adults over 18 years of age with normal hearing. After pure tone audiome...
Teaching approaches to Computer Assisted Language Learning
Computers have been used for language teaching ever since the 1960s. Learning a second language is a challenging endeavor, and, for decades now, proponents of computer assisted language learning (CALL) have declared that help is on the horizon. We investigate the suitability of deploying speech technology in computer based systems that can be used to teach foreign language skills. In this case,...